NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Autonomous Drifting with 3 Minutes of Data via Learned Tire Models

https://doi.org/10.1109/ICRA48891.2023.10161370

Djeumou, Franck; Goh, Jonathan YM; Topcu, Ufuk; Balachandran, Avinash (May 2023, IEEE)

Near the limits of adhesion, the forces generated by a tire are nonlinear and intricately coupled. Efficient and accurate modelling in this region could improve safety, especially in emergency situations where high forces are required. To this end, we propose a novel family of tire force models based on neural ordinary differential equations and a neural-ExpTanh parameterization. These models are designed to satisfy physically insightful assumptions while also having sufficient fidelity to capture higher-order effects directly from vehicle state measurements. They are used as drop-in replacements for an analytical brush tire model in an existing nonlinear model predictive control framework. Experiments with a customized Toyota Supra show that scarce amounts of driving data – less than three minutes – is sufficient to achieve high-performance autonomous drifting on various trajectories with speeds up to 45mph. Comparisons with the benchmark model show a 4x improvement in tracking performance, smoother control inputs, and faster and more consistent computation time.
more » « less
Full Text Available
Task-Guided Inverse Reinforcement Learning under Partial Information

https://doi.org/10.1609/icaps.v32i1.19785

Djeumou, Franck; Cubuktepe, Murat; Lennon, Craig; Topcu, Ufuk (June 2022, Proceedings of the International Conference on Automated Planning and Scheduling)

We study the problem of inverse reinforcement learning (IRL), where the learning agent recovers a reward function using expert demonstrations. Most of the existing IRL techniques make the often unrealistic assumption that the agent has access to full information about the environment. We remove this assumption by developing an algorithm for IRL in partially observable Markov decision processes (POMDPs). The algorithm addresses several limitations of existing techniques that do not take the information asymmetry between the expert and the learner into account. First, it adopts causal entropy as the measure of the likelihood of the expert demonstrations as opposed to entropy in most existing IRL techniques, and avoids a common source of algorithmic complexity. Second, it incorporates task specifications expressed in temporal logic into IRL. Such specifications may be interpreted as side information available to the learner a priori in addition to the demonstrations and may reduce the information asymmetry. Nevertheless, the resulting formulation is still nonconvex due to the intrinsic nonconvexity of the so-called forward problem, i.e., computing an optimal policy given a reward function, in POMDPs. We address this nonconvexity through sequential convex programming and introduce several extensions to solve the forward problem in a scalable manner.This scalability allows computing policies that incorporate memory at the expense of added computational cost yet also outperform memoryless policies. We demonstrate that, even with severely limited data, the algorithm learns reward functions and policies that satisfy the task and induce a similar behavior to the expert by leveraging the side information and incorporating memory into the policy.
more » « less
Full Text Available
Taylor-Lagrange Neural Ordinary Differential Equations: Toward Fast Training and Evaluation of Neural ODEs

https://doi.org/10.24963/ijcai.2022/405

Djeumou, Franck; Neary, Cyrus; Goubault, Eric; Putot, Sylvie; Topcu, Ufuk (July 2022, Thirty-First International Joint Conference on Artificial Intelligence)

Neural ordinary differential equations (NODEs) -- parametrizations of differential equations using neural networks -- have shown tremendous promise in learning models of unknown continuous-time dynamical systems from data. However, every forward evaluation of a NODE requires numerical integration of the neural network used to capture the system dynamics, making their training prohibitively expensive. Existing works rely on off-the-shelf adaptive step-size numerical integration schemes, which often require an excessive number of evaluations of the underlying dynamics network to obtain sufficient accuracy for training. By contrast, we accelerate the evaluation and the training of NODEs by proposing a data-driven approach to their numerical integration. The proposed Taylor-Lagrange NODEs (TL-NODEs) use a fixed-order Taylor expansion for numerical integration, while also learning to estimate the expansion's approximation error. As a result, the proposed approach achieves the same accuracy as adaptive step-size schemes while employing only low-order Taylor expansions, thus greatly reducing the computational cost necessary to integrate the NODE. A suite of numerical experiments, including modeling dynamical systems, image classification, and density estimation, demonstrate that TL-NODEs can be trained more than an order of magnitude faster than state-of-the-art approaches, without any loss in performance.
more » « less
Full Text Available
Learning to Reach, Swim, Walk and Fly in One Trial: Data-Driven Control with Scarce Data and Side Information

Djeumou, Franck.; Topcu, Ufuk. (January 2022, 4th Annual Conference on Learning for Dynamics and Control)
Firoozi, R.; Mehr, N.; Yel, E.; Antonova, R.; Bohg, J.; Schwager, M.; Kochenderfer, M. (Ed.)
We develop a learning-based control algorithm for unknown dynamical systems under very severe data limitations. Specifically, the algorithm has access to streaming and noisy data only from a sin- gle and ongoing trial. It accomplishes such performance by effectively leveraging various forms of side information on the dynamics to reduce the sample complexity. Such side information typically comes from elementary laws of physics and qualitative properties of the system. More precisely, the algorithm approximately solves an optimal control problem encoding the system’s desired be- havior. To this end, it constructs and iteratively refines a data-driven differential inclusion that contains the unknown vector field of the dynamics. The differential inclusion, used in an interval Taylor-based method, enables to over-approximate the set of states the system may reach. Theo- retically, we establish a bound on the suboptimality of the approximate solution with respect to the optimal control with known dynamics. We show that the longer the trial or the more side infor- mation is available, the tighter the bound. Empirically, experiments in a high-fidelity F-16 aircraft simulator and MuJoCo’s environments illustrate that, despite the scarcity of data, the algorithm can provide performance comparable to reinforcement learning algorithms trained over millions of environment interactions. Besides, we show that the algorithm outperforms existing techniques combining system identification and model predictive control.
more » « less
Full Text Available
Blending Controllers via Multi-Objective Bandits

https://doi.org/10.23919/ACC53348.2022.9867486

Gohari, Parham; Djeumou, Franck; Vinod, Abraham P.; and Topcu, Ufuk (June 2022, Proceedings of the American Control Conference)

Full Text Available
Neural Networks with Physics-Informed Architectures and Constraints for Dynamical Systems Modeling

Djeumou, Franck.; Neary, Cyrus.; Goubault, Eric.; Putot, Sylvie.; Topcu, Ufuk. (January 2022, 4th Annual Conference on Learning for Dynamics and Control)
Firoozi, R.; Mehr, N.; Yel, E.; Antonova, R.; Bohg, J.; Schwager, M.; Kochenderfer, M. (Ed.)
Effective inclusion of physics-based knowledge into deep neural network models of dynamical sys- tems can greatly improve data efficiency and generalization. Such a priori knowledge might arise from physical principles (e.g., conservation laws) or from the system’s design (e.g., the Jacobian matrix of a robot), even if large portions of the system dynamics remain unknown. We develop a framework to learn dynamics models from trajectory data while incorporating a priori system knowledge as inductive bias. More specifically, the proposed framework uses physics-based side information to inform the structure of the neural network itself, and to place constraints on the values of the outputs and the internal states of the model. It represents the system’s vector field as a composition of known and unknown functions, the latter of which are parametrized by neural networks. The physics-informed constraints are enforced via the augmented Lagrangian method during the model’s training. We experimentally demonstrate the benefits of the proposed approach on a variety of dynamical systems – including a benchmark suite of robotics environments featur- ing large state spaces, non-linear dynamics, external forces, contact forces, and control inputs. By exploiting a priori system knowledge during training, the proposed approach learns to predict the system dynamics two orders of magnitude more accurately than a baseline approach that does not include prior knowledge, given the same training dataset.
more » « less
Full Text Available

Search for: All records